KMP (Knuth-Morris-Pratt) String Matching Demo

Efficient pattern search using the Longest Proper Prefix (LPS) array.

šŸ“ Problem Statement & Input

Problem Statement:

Lily, a software engineer, needs to find all occurrences of a specific pattern in a larger text for her text analysis project. Given two strings $\text{txt}$ (the text) and $\text{pat}$ (the pattern) of sizes $N$ and $M$ respectively, where $N > M$, help Lily by printing all starting indices of the pattern in the text using the KMP algorithm.

**Input:** Text ($\text{txt}$) and Pattern ($\text{pat}$).

**Output:** The starting indices of all occurrences of the pattern in the text (0-based indexing).

Input Section

Initial Visualization:

Text:

Pattern:

šŸ’” LPS Array Computation (Preprocessing)

LPS Definition:

The $\text{LPS}[i]$ value is the length of the longest proper prefix of $\text{pat}[0 \dots i]$ that is also a suffix of $\text{pat}[0 \dots i]$. This array dictates the optimal shift distance upon a mismatch.

Pattern ($\text{pat}$) and its LPS Array ($\pi$):

Indices (k)
Pattern (pat)
LPS Array ($\pi[k]$)

LPS Calculation Summary:

LPS array will be calculated upon loading the input.

šŸŽ¬ Step-by-Step Pattern Search

Preprocessing complete. Ready to begin search.

Search Visualization

Text (txt)
Pattern (pat)

**Pointers:** Text Index ($i$) = ?, Pattern Index ($j$) = ?

Output Log


            

Found Pattern Indices (0-based): None

šŸ“Š Algorithm Analysis

Time Complexity

O(N + M)

The **LPS array preprocessing** takes $O(M)$ time. The **search phase** takes $O(N)$ time. Since $N > M$, the overall complexity is $O(N+M)$, which is optimal.

Space Complexity

O(M)

Space is required only to store the $\text{LPS}$ array, which has a size equal to the length of the pattern ($M$).

Why KMP is Superior

The Naive algorithm's worst-case time complexity is $O(N \cdot M)$. KMP avoids this by intelligently sliding the pattern based on the pre-computed $\text{LPS}$ array, never comparing the same text character twice. This is particularly crucial for strings with repeating patterns, like $T=\text{AAAAAB}$ and $P=\text{AAAAB}$.